Tight clustering for large datasets with an application to gene expression data
Similar resources
IMDC: An Image-Mapped Data Clustering Technique for Large Datasets
In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First, the dataset is mapped into a binary image plane. The synthesized image is then processed using efficient image processing techniques to cluster the data in the dataset. Hence, the algorithm avoids an exhaustive search to identify clusters. The algorithm considers only a...
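The idea of mapping a dataset onto a binary image and clustering in image space can be illustrated with a minimal sketch. This is not the IMDC authors' algorithm, whose details are cut off in the abstract above. It assumes 2-D points, a fixed square grid, and 8-connected component labelling as the image processing step. The function name `image_mapped_clusters` and the `grid_size` parameter are hypothetical.

```python
from collections import deque


def image_mapped_clusters(points, grid_size=32):
    """Cluster 2-D points by rasterising them and labelling connected pixels.

    Illustrative assumption only: occupied pixels that touch (8-connectivity)
    are treated as one cluster. Returns one integer label per input point.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Guard against a zero span when all points share a coordinate.
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0

    def to_pixel(p):
        # Scale each coordinate into [0, grid_size) and clamp the upper edge.
        col = min(int((p[0] - min_x) / span_x * grid_size), grid_size - 1)
        row = min(int((p[1] - min_y) / span_y * grid_size), grid_size - 1)
        return row, col

    pixels = [to_pixel(p) for p in points]
    occupied = set(pixels)  # the "binary image": a pixel is on if it holds a point

    labels = {}
    next_label = 0
    for start in sorted(occupied):
        if start in labels:
            continue
        # Breadth-first flood fill over neighbouring occupied pixels.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in occupied and nb not in labels:
                        labels[nb] = next_label
                        queue.append(nb)
        next_label += 1

    return [labels[px] for px in pixels]


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.1, 0.1), (10.0, 10.0), (10.1, 9.9)]
    print(image_mapped_clusters(sample, grid_size=8))
```

Because the flood fill only visits occupied pixels, the cost depends on the number of occupied pixels rather than on pairwise distances between points, which is one way such a mapping can avoid an exhaustive search. The grid resolution controls how close two points must be to end up in the same cluster.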
Clustering Algorithms: Their Application to Gene Expression Data
Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a valuable opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the chall...
Clustering Large Datasets Using Data Stream Clustering Techniques
Unsupervised identification of groups in large data sets is important for many machine learning and knowledge discovery applications. Conventional clustering approaches (k-means, hierarchical clustering, etc.) typically do not scale well for very large data sets. In recent years, data stream clustering algorithms have been proposed which can deal efficiently with potentially unbounded ...
A Method for Tight Clustering: with Application to Microarray
In this paper we propose a method for clustering that produces tight and stable clusters without forcing all points into clusters. Many existing clustering algorithms have been applied to microarray data to search for gene clusters with similar expression patterns. However, none has provided a way to deal with an essential feature of array data: many genes are expressed sporadically and do not ...
Efficient Evidence Accumulation Clustering for large datasets/big data
The unprecedented collection and storage of data in electronic format has given rise to an interest in automated analysis for generation of knowledge and new insights. Cluster analysis is a good candidate since it makes as few assumptions about the data as possible. A vast body of work on clustering methods exists, yet, typically, no single method is able to respond to the specificities of all...
Journal
Journal title: Scientific Reports
Year: 2019
ISSN: 2045-2322
DOI: 10.1038/s41598-019-39459-w